ALLELIC VARIATION IN HUMAN
GENE EXPRESSION 

[01] The U.S. government retains certain rights in the invention by virtue of its support of
the underlying work involved in making the invention, under the terms of National
Institutes of Health grants CA57345, CA62924, and CA43460.

BACKGROUND OF THE INVENTION 

Field of the Invention 

[02] The invention relates to the field of diagnostic and prognostic testing. In particular, it
relates to detecting variations in gene expression between individuals in a population
that may indicate disease susceptibility, or predict the phenotype of traits deemed
within normal variation.

Background of the Prior Art 

[03] Understanding the genetic basis of human variation is one of the most important
goals of modern biomedical research. Much work in this area is focused on
genetic polymorphisms associated with structural alterations of the encoded
proteins. However, studies in other organisms suggest that such protein
polymorphisms account for only a fraction of normal variation and that
differences in gene expression levels account for a major part of the variation
within and among species (1, 2). In humans, altered gene expression has not been
systematically addressed in the context of normal human variation.

[04] There is a need in the art for techniques for assessing variation in gene
expression and for associating such variations with disease states and disease
susceptibility.

BRIEF SUMMARY OF THE INVENTION

[05] In a first embodiment of the invention a method of associating a genotype with a 
phenotype is provided. Levels of expression of an allele of a gene in a first 
population comprising affected individuals are determined. The affected individuals 
share a phenotype. Levels of expression of the allele in a second population 
comprising control individuals are determined. The control individuals do not share 
the phenotype. The levels of expression of the allele in the first and the second 
populations are compared. An allele whose expression differs in a statistically 
significant manner between the first and the second populations is identified as having 
an association with the phenotype. 
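
By way of illustration only, the comparison of the first embodiment can be sketched as follows. The sketch assumes that one expression level per individual has already been measured for the allele of interest. This paragraph does not prescribe a particular statistical test, so a two-sided permutation test on the difference in group means is used here as a stand-in. The function name, the data values, and the 0.05 cutoff are hypothetical.

```python
import math
import random


def permutation_p_value(affected, controls, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in mean expression.

    Illustrative stand-in only; any test that detects a statistically
    significant difference between the two populations could be used.
    """
    rng = random.Random(seed)

    def mean(values):
        # math.fsum is exactly rounded, so the result does not depend on order.
        return math.fsum(values) / len(values)

    observed = abs(mean(affected) - mean(controls))
    pooled = list(affected) + list(controls)
    n_affected = len(affected)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_affected]) - mean(pooled[n_affected:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical expression levels of one allele (arbitrary units).
affected = [0.62, 0.66, 0.71, 0.64, 0.69, 0.67]
controls = [0.49, 0.52, 0.50, 0.47, 0.51, 0.53]

p_value = permutation_p_value(affected, controls)
associated = p_value < 0.05  # assumed significance cutoff
```

An allele for which `associated` is true would be identified as having an association with the phenotype under the stated assumptions.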

[06] In a second embodiment of the invention a method is provided for measuring allelic
expression variation in a non-imprinted gene in an individual. Messenger RNA
(mRNA) from an individual heterozygous for a single nucleotide polymorphism
(SNP) in a non-imprinted gene is reverse transcribed and amplified to form first
cDNA from a first allele and second cDNA from a second allele. Primers are
hybridized to the first cDNA and the second cDNA. Those primers hybridized to the
first cDNA and the second cDNA are differentially labeled to form differentially
labeled first and second primers. The amount of differentially labeled first primers is
compared to the amount of differentially labeled second primers. A statistically
significant difference between the amount of labeled first primers and the amount of
labeled second primers indicates that the first and second alleles are differentially
expressed in the individual.

[07] In a third embodiment, a method is provided for measuring allelic expression
variation in a non-imprinted gene in an individual. Messenger RNA (mRNA) from an
individual heterozygous for a single nucleotide polymorphism (SNP) in a
non-imprinted gene is reverse transcribed and amplified to form first cDNA from a first
allele and second cDNA from a second allele. Primers are hybridized to the first cDNA
and the second cDNA. Those primers hybridized to the first cDNA are differentially
labeled from those hybridized to the second cDNA using fluorescent dye terminators
and a single base extension reaction to form differentially labeled first and second
primers. The amount of differentially labeled first primers is compared to the amount
of differentially labeled second primers using capillary electrophoresis. A statistically
significant difference in the amount of labeled first primers from the amount of
labeled second primers indicates that the first and second alleles are differentially
expressed in the individual.

[08] In a fourth embodiment of the invention a method is provided for measuring allelic
expression variation in a non-imprinted gene in a first individual. The level of expression
of an allele of a gene in a first individual displaying a phenotype is determined, as is
the level of expression of the allele in a population of control individuals. The control
individuals do not display the phenotype. The level of expression of the allele in the first
individual is compared to the level of expression in the population of control individuals.
A statistically significant difference in the levels of expression indicates that the allele
in the first individual may be associated with the phenotype.
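
As a non-limiting sketch of the fourth embodiment, a single individual's expression level can be compared with the distribution observed in the control population. The sketch below assumes the control levels are approximately normally distributed and uses a two-sided z-test; that choice, the function name, the data, and the 0.05 cutoff are assumptions made for illustration. For small control groups a t-based interval would be more appropriate.

```python
from statistics import NormalDist, mean, stdev


def individual_vs_controls(individual_level, control_levels, alpha=0.05):
    """Compare one individual's allele expression with a control population.

    Returns (z, p, flagged). Assumes approximately normal control levels.
    """
    mu = mean(control_levels)
    sd = stdev(control_levels)  # sample standard deviation of the controls
    z = (individual_level - mu) / sd
    # Two-sided p-value under a standard normal approximation.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p, p < alpha


# Hypothetical expression levels of the allele (arbitrary units).
controls = [0.48, 0.50, 0.52, 0.49, 0.51, 0.50, 0.47, 0.53]

z_high, p_high, flagged_high = individual_vs_controls(0.70, controls)
z_mid, p_mid, flagged_mid = individual_vs_controls(0.50, controls)
```

Here the first individual lies far outside the control distribution and is flagged, while the second is indistinguishable from the controls.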

[09] These and other embodiments of the invention provide the art with an additional 
dimension for assessing genetic diversity. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[10] Fig. 1 shows a schematic of the assay for fractional allelic expression, showing key steps.
See text for additional details.

[11] Fig. 2 shows the results of allelic expression analyses performed as described below in
note (3). Representative results are shown for eight genes. The shaded box represents
the approximate 95% confidence interval and red bars indicate individuals displaying
significant variations, as defined in note (5).

[12] Fig. 3 shows examples of two kindreds exhibiting Mendelian inheritance patterns
in either the PKD2 or Calpain-10 gene. Only individuals who were heterozygous
for the SNP or were used to deduce haplotypes are shown. The individuals
displaying altered fractional allelic expression are shaded red, and the individuals
originally found to display altered expression are indicated by arrows. An
obligate carrier in the PKD2 pedigree who could not be scored is indicated with
a red dot. The results of genotype analyses are shown directly above each
member of the pedigrees. The markers employed are listed at the right and each
allele observed in a family was assigned a number. Markers suggesting a
recombination are underlined and the allele associated with altered expression is
indicated in red. The fractional allelic expression data used to score the pedigree
are shown above the genotype and were interpreted as described in the legend
to Figure 2.

DETAILED DESCRIPTION OF THE INVENTION

[13] We have here developed methods to quantitatively evaluate allelic variation in 
gene expression and applied them to the analysis of 13 different genes. We found 
allelic variation in expression levels in six of these genes, and showed that these 
variations were often heritable. The results suggest that genetically-determined 
variation in expression levels is an important component of human diversity and 
have significant implications for normal and abnormal human physiology. 

[14] Phenotypes which can be assessed according to the present invention are those
which relate to disease as well as those which relate to normal human
physiology. Examples of phenotypes include disease susceptibility, birth defects,
psychological parameters, learning parameters, and physical characteristics. The
phenotype is preferably a polymorphic phenotype, i.e., many forms of the
characteristic exist. Individuals who share a particular phenotype are grouped
together and are termed "affected individuals" for purposes of this invention.
Individuals who do not share the particular phenotype are used to form a
control population.

[15] Levels of expression of an allele can be determined using any techniques which
are known in the art. Such techniques include but are not limited to
allele-specific expression assays, oligonucleotide ligase assays, and dideoxy single-base
extension of an unlabeled oligonucleotide primer, described in more detail
below. Any technique can be used that can distinguish between expression
products of alleles. The level of expression of a single allele of a gene can be
determined in isolation, without comparing expression to the second allele
present in an individual. Alternatively, the level of expression of one allele of a
gene in an individual can be compared to the level of a second allele of the gene
in the individual.

[16] Levels of expression are compared to determine statistically significant differences. 
Any statistical analysis can be used which determines such differences. One 
particular analysis which can be used is the MIXED procedure of the SAS system 
version 8.0 for repeated measurements. A statistically significant difference can be a 
5 % difference, a 10 % difference, a 15 % difference, a 20 % difference, a 25 % 
difference, or more. 

[17] Haplotypes that are associated with an altered level of expression of an allele can be 
determined. The haplotypes can be used as surrogates for the altered level of 
expression. The haplotypes can be used to follow the altered expression levels either 
within a population or within a family. 

[18] Variations in expression can be determined to be heritable if they are determined in
related individuals, such as parents and offspring. If the variation in expression is
determined to be consistently inherited along with at least two adjacent microsatellite
markers, for example, then the variation is indicated to be heritable.

[19] A heritable variation in expression levels can be studied to determine any changes in
sequence which might account for the expression alteration. Such changes are likely
to be located in control regions such as the promoter, although they can occur
elsewhere. The changes can be subtle, single base pair changes or they can be
insertions or deletions. Such changes can be determined by mapping and/or
sequencing or other techniques known in the art for determining genetic changes.

[20] While the invention has been described with respect to specific examples including 
presently preferred modes of carrying out the invention, those skilled in the art will 
appreciate that there are numerous variations and permutations of the above described 
systems and techniques that fall within the spirit and scope of the invention as set 
forth in the appended claims. 

Example 

[21] The analysis of variation of gene expression is complicated by the expected
magnitude of the differences; complete loss of expression from one allele results
in a reduction of total expression levels of only 50%. However, comparing
expression of one allele to the other can greatly facilitate the detection of such
differences. Importantly, such comparisons ensure that the alleles are both
expressed within the identical intracellular environment and are independent of
environmental factors. To make these comparisons, we studied RT-PCR products
derived from the mRNA of normal individuals who were heterozygous for SNPs
within the studied transcripts (Fig. 1A). The PCR products derived from each
allele were then distinguished using differentially labeled fluorescent dideoxy
terminators in single nucleotide extensions. The products were quantified by
capillary gel electrophoresis and reproducibility was ensured by the analysis of
seven replicates of each sample (Fig. 1A).

[22] We applied this approach to lymphoblastoid cells derived from 96 normal
individuals from CEPH reference families (3). To validate our approach, we first
examined allelic expression of the APC tumor suppressor gene (APC) in CEPH
individuals and in an FAP patient previously shown to have decreased expression
of one allele (4). No significant variation in fractional allelic expression was
observed in any of 17 heterozygous CEPH individuals tested (5). In contrast,
unequal allelic expression was detectable in the FAP patient (Fig. 1B). Based on
these and other control analyses, we estimate that we were able to confidently
identify variation when expression of the two alleles differed by more than
20% (6).
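
The quantification described above can be sketched as follows, under stated assumptions. The sketch assumes that each replicate yields one fluorescence peak area per allele, defines fractional allelic expression as the share of signal from the first allele, and interprets "differed by more than 20%" as a ratio of the more abundant to the less abundant transcript exceeding 1.2:1. The function names, the peak areas, and that interpretation of the threshold are hypothetical and are not taken from the notes cited in the text.

```python
from statistics import mean, stdev


def fractional_allelic_expression(peak_allele1, peak_allele2):
    """Share of the total signal contributed by allele 1 (0.5 means equal)."""
    return peak_allele1 / (peak_allele1 + peak_allele2)


def summarize_replicates(replicates):
    """Mean and sample standard deviation of the fraction across replicates."""
    fractions = [fractional_allelic_expression(a, b) for a, b in replicates]
    return mean(fractions), stdev(fractions)


def allelic_ratio(fraction):
    """Convert a fraction f into the ratio allele 1 : allele 2 = f / (1 - f)."""
    return fraction / (1.0 - fraction)


def exceeds_threshold(fraction, min_ratio=1.2):
    """True if the two alleles differ by more than the assumed 1.2:1 ratio."""
    if fraction <= 0.0 or fraction >= 1.0:
        return True  # one allele contributes no detectable signal
    ratio = allelic_ratio(fraction)
    return max(ratio, 1.0 / ratio) > min_ratio


# Seven hypothetical replicates of (allele 1 peak area, allele 2 peak area).
replicates = [
    (1310, 1000), (1295, 1005), (1320, 990), (1300, 1010),
    (1285, 1000), (1315, 995), (1305, 1000),
]

mean_fraction, sd_fraction = summarize_replicates(replicates)
variant = exceeds_threshold(mean_fraction)
```

Under this definition, equal expression corresponds to a fraction of 0.5, and complete loss of one allele moves the fraction to 0 or 1 even though total expression falls by only 50%.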

[23] We next examined variation in 12 additional genes containing relatively common
SNPs (Table 1). For each gene, we first studied genomic DNA to determine which
of the 96 individuals were heterozygous at these loci, and identified on average
23 heterozygous individuals for further study. Significant differences in allelic
variation were observed in 6 of these 12 genes. The fraction of patients exhibiting
variation in allelic expression ranged from 3% (one of 37 individuals tested for
Catalase) to 30% (six of 20 individuals tested for p73) (Table 1 and Fig. 1B). In
those individuals whose alleles were differentially expressed, the ratio of
transcripts varied from 1.3:1 (fUNO to 4.3:1 (p73).

[24] Given that these variations were each observed in a minority of individuals, it is 
unlikely they were due to genetic imprinting. It was not possible to determine if 
the altered expression was due to increased or decreased expression of the rare 
allele from these analyses. 

[25] To determine whether the variations were heritable, we examined the families of
nine individuals exhibiting allelic variation in the assays described above. Six of
these families proved uninformative (7). The other three families were
informative and each displayed a pattern of expression fully consistent with
Mendelian inheritance. These included two families with allelic variation of
Calpain-10 expression and one family with allelic variation of PKD2 expression
(examples in Fig. 1C). In each of the families, the altered expression was found to
be consistently inherited with a single haplotype defined by at least two adjacent
microsatellite markers. Moreover, it was possible to deduce the nature of the
altered allelic expression from these family studies. In the case of PKD2, the
altered allelic expression was due to increased expression of the affected allele
whereas in both Calpain-10 families, it was due to decreased expression of the
affected allele.
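
The co-inheritance check described above can be illustrated with a short sketch. It assumes each family member has been assigned two haplotypes, each written as a tuple of numbered alleles at two adjacent microsatellite markers, and that each member's expression has been scored as altered, not altered, or not scorable. The pedigree, the allele numbers, and the function name are hypothetical.

```python
def cosegregates(members, haplotype):
    """True if altered expression tracks the given haplotype in every scored member.

    Members whose expression could not be scored (altered is None) are skipped.
    """
    scored = [m for m in members if m["altered"] is not None]
    if not scored:
        return False  # nothing to evaluate; the family is uninformative
    return all((haplotype in m["haplotypes"]) == m["altered"] for m in scored)


# Hypothetical pedigree; each haplotype is (allele at marker A, allele at marker B).
family = [
    {"id": "father", "haplotypes": [(1, 3), (2, 2)], "altered": True},
    {"id": "mother", "haplotypes": [(4, 1), (3, 2)], "altered": False},
    {"id": "child_1", "haplotypes": [(1, 3), (4, 1)], "altered": True},
    {"id": "child_2", "haplotypes": [(2, 2), (3, 2)], "altered": False},
    {"id": "child_3", "haplotypes": [(1, 3), (3, 2)], "altered": None},  # not scorable
]

tracks_1_3 = cosegregates(family, (1, 3))  # carried by every altered member
tracks_2_2 = cosegregates(family, (2, 2))  # absent from child_1, who is altered
```

A haplotype for which the check succeeds across an informative family can then serve as a surrogate for the altered expression level when following it through that family.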

[26] These findings provide strong evidence that cis-acting inherited variations in
gene expression are relatively common among normal individuals. In this regard,
it is important to note that our measurements likely represent an underestimate
of such differences in gene expression as they were derived from a single cell
type and additional variations in allelic expression may manifest in a cell-type
specific manner.

[27] While we have focused on normal differences in allelic expression in this study,
our results have obvious implications for disease susceptibility. They suggest an
approach for connecting genotype to phenotype in which the expression levels
of genes are measured in patients and compared to controls. This strategy would
have two clear advantages over methods based on linkage as commonly used in
association, sib-pair, and related studies (8, 9). First, any expression differences
noted would provide direct evidence for the implicated gene's causal role, while
linkage data can at best indicate that some gene in the linked region is
responsible for the phenotype. Second, expression data are independent of
population structure and do not rely on the absence of recombination between
the marker and the responsible gene. We anticipate that the approach described
above or other methods for measuring allelic variation in gene expression will
play a major role in defining normal human variation and disease susceptibility
in the future.

Table 1 - Allelic Variation in Gene Expression